Abstract: Frequent patterns are patterns that appear frequently in a data set, and frequent pattern mining searches for such recurring relationships. It is an important data mining task that plays a central role in mining associations and in correlation analysis among data. This work focuses on discovering frequent item sets in data-stream environments, which may suffer from data overload. Stream data are data that flow into a system in vast volumes, change dynamically, and contain multidimensional features. Traditional frequent pattern algorithms are not suitable for finding frequent patterns in stream data. This paper proposes a frequent pattern mining algorithm that integrates two data overload handling mechanisms. The algorithm extracts basic information from the streaming data, i.e., the frequencies of data items, and keeps it as base information. On user request, the algorithm generates frequent item sets from the base information, using an approximate inclusion-exclusion technique to calculate the approximate counts of the item sets. A self-adaptive sliding window time model is implemented to process the data stream. When data overload occurs, the algorithm chooses a data overload handling mechanism based on the nature of the data. The experimental results show that the mining algorithm performs well in the data overload state and still generates frequent item sets.
Keywords: Data mining, Stream data, Frequent patterns, Approximate inclusion-exclusion, Adaptive sliding window model, Data overload handling mechanisms.
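To make the notion of base information concrete, the following is a minimal sketch and not the algorithm proposed in this paper. It keeps item frequencies over a sliding window of transactions and derives first-order inclusion-exclusion bounds on the count of a two-item set from those frequencies alone. The class name `SlidingWindowCounts`, its methods, and the explicit `resize` call (standing in for self-adaptive window sizing) are assumptions made for illustration; the approximate inclusion-exclusion technique itself is not reproduced here.

```python
from collections import Counter, deque


class SlidingWindowCounts:
    """Item frequencies over the most recent `size` transactions (hypothetical helper)."""

    def __init__(self, size):
        self.size = size          # maximum number of transactions kept
        self.window = deque()     # transactions currently in the window
        self.counts = Counter()   # base information: per-item frequencies

    def add(self, transaction):
        """Add one transaction and evict the oldest ones if the window is full."""
        items = frozenset(transaction)
        self.window.append(items)
        self.counts.update(items)
        self._evict()

    def resize(self, size):
        """Change the window size; a shrink evicts the oldest transactions."""
        self.size = size
        self._evict()

    def _evict(self):
        while len(self.window) > self.size:
            old = self.window.popleft()
            self.counts.subtract(old)
            for item in old:
                if self.counts[item] <= 0:
                    del self.counts[item]

    def pair_bounds(self, a, b):
        """Bounds on count({a, b}) using item frequencies only.

        From count(a or b) = count(a) + count(b) - count(a and b) and
        count(a or b) <= n, the lower bound is count(a) + count(b) - n.
        The upper bound is min(count(a), count(b)).
        """
        n = len(self.window)
        ca, cb = self.counts[a], self.counts[b]
        return max(0, ca + cb - n), min(ca, cb)


window = SlidingWindowCounts(size=3)
for transaction in [["a", "b"], ["a"], ["a", "b"], ["b", "c"]]:
    window.add(transaction)
bounds_ab = window.pair_bounds("a", "b")
```

The bounds are exact only in special cases; in general they bracket the true count, which is why an approximation technique is needed when only base information is stored.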